RL as Regressor: A Reinforcement Learning Approach for Function Approximation
Standard regression techniques, while powerful, are often constrained by predefined, differentiable loss functions such as mean squared error. These functions may not fully capture the desired behavior of a system, especially when dealing with asymmetric costs or complex, non-differentiable objectives. In this paper, we explore an alternative paradigm: framing regression as a Reinforcement Learning (RL) problem. By treating a model's prediction as an action and defining a custom reward signal based on the prediction error, we can leverage powerful RL algorithms to perform function approximation. Through a progressive case study of learning a noisy sine wave, we illustrate the development of an Actor-Critic agent, iteratively enhancing it with Prioritized Experience Replay, increased network capacity, and positional encoding to produce a capable RL agent for this regression task. Our results show that the RL framework not only successfully solves the regression problem but also offers enhanced flexibility in defining objectives and guiding the learning process.
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.61)
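The abstract above does not include code, so here is a rough illustrative sketch of the framing it describes: the prediction is the action, and the reward is derived from the prediction error. The sketch uses a REINFORCE-style Gaussian policy with an asymmetric reward and sinusoidal positional-encoding features; the paper's Actor-Critic architecture and Prioritized Experience Replay are not reproduced, and all names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def positional_encoding(x, n_freq=4):
    """Sinusoidal features for a scalar input (one of the paper's enhancements)."""
    feats = [1.0]
    for k in range(n_freq):
        feats += [np.sin(2**k * x), np.cos(2**k * x)]
    return np.array(feats)

def reward(pred, target, under_penalty=2.0):
    """Custom asymmetric reward: under-prediction is penalised twice as hard."""
    err = pred - target
    return -(under_penalty * -err if err < 0 else err)

# Gaussian policy: the "action" is the prediction itself, action ~ N(w @ phi(x), sigma^2).
dim, sigma, lr = 1 + 2 * 4, 0.3, 0.01
w, baseline = np.zeros(dim), 0.0

for _ in range(5000):
    x = rng.uniform(0.0, 2.0 * np.pi)
    target = np.sin(x) + rng.normal(0.0, 0.05)   # noisy sine wave to regress
    phi = positional_encoding(x)
    action = w @ phi + sigma * rng.normal()      # prediction sampled from the policy
    r = reward(action, target)
    baseline += 0.01 * (r - baseline)            # running baseline to reduce variance
    # REINFORCE: grad of log pi(a|x) is (a - mean) / sigma^2 * phi
    w += lr * (r - baseline) * (action - w @ phi) / sigma**2 * phi
```

Because the reward is a plain Python function, swapping in a different non-differentiable objective requires no change to the update rule, which is the flexibility the abstract highlights.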
SalesRLAgent: A Reinforcement Learning Approach for Real-Time Sales Conversion Prediction and Optimization
Current approaches to sales conversation analysis and conversion prediction typically rely on Large Language Models (LLMs) combined with basic retrieval augmented generation (RAG). These systems, while capable of answering questions, fail to accurately predict conversion probability or provide strategic guidance in real time. In this paper, we present SalesRLAgent, a novel framework leveraging specialized reinforcement learning to predict conversion probability throughout sales conversations. Unlike systems from Kapa.ai, Mendable, Inkeep, and others that primarily use off-the-shelf LLMs for content generation, our approach treats conversion prediction as a sequential decision problem, training on synthetic data generated using GPT-4o to develop a specialized probability estimation model. Our system incorporates Azure OpenAI embeddings (3072 dimensions), turn-by-turn state tracking, and meta-learning capabilities to understand its own knowledge boundaries. Evaluations demonstrate that SalesRLAgent achieves 96.7% accuracy in conversion prediction, outperforming LLM-only approaches by 34.7% while offering significantly faster inference (85ms vs. 3450ms for GPT-4). Furthermore, integration with existing sales platforms shows a 43.2% increase in conversion rates when representatives utilize our system's real-time guidance. SalesRLAgent represents a fundamental shift from content generation to strategic sales intelligence, providing moment-by-moment conversion probability estimation with actionable insights for sales professionals.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
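The SalesRLAgent system itself is not public, so the following is only a minimal sketch of the turn-by-turn state-tracking idea the abstract mentions: a conversation state maintained over 3072-dimensional embeddings and a probability estimate read off that state. The exponential moving average, the linear probe, and every parameter here are stand-in assumptions, not the actual system's design.

```python
import numpy as np

EMB_DIM = 3072  # matches the embedding dimensionality quoted in the abstract

def update_state(state, turn_embedding, decay=0.8):
    """Turn-by-turn state tracking as an exponential moving average of turn
    embeddings (a stand-in for the system's real state representation)."""
    return decay * state + (1.0 - decay) * turn_embedding

def conversion_probability(state, w, b=0.0):
    """Linear probe plus sigmoid over the conversation state; `w` and `b`
    stand in for a trained estimator's parameters."""
    return 1.0 / (1.0 + np.exp(-(w @ state + b)))

# Toy run over five conversation turns with placeholder random "embeddings".
rng = np.random.default_rng(0)
state = np.zeros(EMB_DIM)
w = rng.normal(0.0, 0.01, EMB_DIM)
for _ in range(5):
    turn = rng.normal(0.0, 1.0, EMB_DIM)  # placeholder for an Azure OpenAI embedding
    state = update_state(state, turn)
p = conversion_probability(state, w)
```

Because the state is updated once per turn, the probability estimate can be refreshed after every utterance, which is what makes sub-100ms "moment-by-moment" guidance plausible.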
Review for NeurIPS paper: TorsionNet: A Reinforcement Learning Approach to Sequential Conformer Search
Weaknesses: Because the idea is new and very interesting, a number of topics came up that could or should be addressed. Is there a way to be certain that the gradient descent using MMFF keeps the molecule in the same basin of the PES that the rigid rotor sampled? It is likely, particularly in crowded conformations, that the structure and energy that MMFF reports are not for the same internal angles as the initial torsion angles would suggest. The Gibbs Score is introduced as a completely new idea, but it is essentially related to a (relative) population according to Maxwell-Boltzmann statistics. Furthermore, the log of the Gibbs Score is then a relative free energy, a very intuitive connection with the underlying physics.
Review for NeurIPS paper: TorsionNet: A Reinforcement Learning Approach to Sequential Conformer Search
The reviewers found this paper to be interesting and compelling, nicely summarized by R2 in discussion: "I think the method is sound and exciting, and the key challenges in transferability live in the availability of (high-accuracy) training data and in the challenges of representation learning for molecules (GCNs need to be exposed to a lot of chemical variability to be able to interpolate in chemical space). The alkanes are essentially the same bond over and over, and lignin is trained and tested in the same chemical space. I insist that these are representation learning challenges to be solved by the community, and improvements there could be combined with this RL approach." That said, the reviewers did find several areas where the paper can be improved. Because of space limitations, I understand that not all of these suggestions can be incorporated within page limits, but I do expect the authors to address as much as possible within the main final text, with all remaining feedback addressed either in the main text or in a supplementary appendix.
Enhancing Disaster Resilience with UAV-Assisted Edge Computing: A Reinforcement Learning Approach to Managing Heterogeneous Edge Devices
Azfar, Talha, Huang, Kaicong, Ke, Ruimin
Edge sensing and computing is rapidly becoming part of intelligent infrastructure architecture, leading to operational reliance on such systems in disaster or emergency situations. In such scenarios there is a high chance of power supply failure due to power grid issues, and of communication failure due to base stations losing power or being damaged by the elements, e.g., flooding, wildfires, etc. Mobile edge computing in the form of unmanned aerial vehicles (UAVs) has been proposed to provide computation offloading from these devices to conserve their battery, while the use of UAVs as relay network nodes has also been investigated previously. This paper considers the use of UAVs under further constraints on power and connectivity to prolong the life of the network while also ensuring that data is received from the edge nodes in a timely manner. Reinforcement learning is used to investigate numerous scenarios with various levels of power and communication failure. The approach is able to identify the device most likely to fail in a given scenario, thus providing priority guidance for maintenance personnel. Evacuations of a rural town and an urban downtown area are also simulated to demonstrate the effectiveness of the approach at extending the life of the most critical edge devices.
- North America > United States > New York > Rensselaer County > Troy (0.04)
- Asia > Middle East > Yemen > Amanat Al Asimah > Sanaa (0.04)
- Information Technology (1.00)
- Energy > Energy Storage (0.46)
- Energy > Power Industry (0.34)
TorsionNet: A Reinforcement Learning Approach to Sequential Conformer Search
Molecular geometry prediction of flexible molecules, or conformer search, is a long-standing challenge in computational chemistry. This task is of great importance for predicting structure-activity relationships for a wide variety of substances ranging from biomolecules to ubiquitous materials. Substantial computational resources are invested in Monte Carlo and Molecular Dynamics methods to generate diverse and representative conformer sets for medium to large molecules, which are yet intractable to chemoinformatic conformer search methods. We present TorsionNet, an efficient sequential conformer search technique based on reinforcement learning under the rigid rotor approximation. The model is trained via curriculum learning, whose theoretical benefit is explored in detail, to maximize a novel metric grounded in thermodynamics called the Gibbs Score.
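The paper's exact Gibbs Score definition is not restated here, but as the NeurIPS review above observes, it is essentially related to relative conformer populations under Maxwell-Boltzmann statistics. The following sketch computes only that underlying quantity, Boltzmann-weighted populations from conformer energies, and is not the paper's metric itself.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def boltzmann_weights(energies_kcal, T=298.15):
    """Relative conformer populations p_i proportional to exp(-E_i / kT).
    Energies are shifted by the minimum for numerical stability; the log of
    such a relative population is a relative free energy (in units of kT)."""
    e0 = min(energies_kcal)
    ws = [math.exp(-(e - e0) / (K_B * T)) for e in energies_kcal]
    z = sum(ws)
    return [wi / z for wi in ws]
```

At room temperature kT is about 0.59 kcal/mol, so a conformer roughly 0.6 kcal/mol above the minimum is already about e times less populated, which is why a population-based score rewards finding the low-energy conformers.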
Optimizing Low-Speed Autonomous Driving: A Reinforcement Learning Approach to Route Stability and Maximum Speed
Li, Benny Bao-Sheng, Wu, Elena, Yang, Hins Shao-Xuan, Liang, Nicky Yao-Jin
Autonomous driving has garnered significant attention in recent years, especially in optimizing vehicle performance under varying conditions. This paper addresses the challenge of maintaining maximum speed stability in low-speed autonomous driving while following a predefined route. Leveraging reinforcement learning (RL), we propose a novel approach to optimize driving policies that enable the vehicle to achieve near-maximum speed without compromising route stability.
Reinforcement Learning (RL) has become a powerful approach for addressing complex decision-making challenges in autonomous systems, particularly in low-speed scenarios. Unlike high-speed driving, low-speed environments demand high precision, safety, and stability [7] due to dynamic obstacles and confined spaces. This paper explores several applications of RL in low-speed contexts, demonstrating its potential to enhance performance in various tasks.
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- Research Report (1.00)
- Overview > Innovation (0.34)
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks (1.00)
- Information Technology > Robotics & Automation (0.83)
Reinforcement Learning Approach for Integrating Compressed Contexts into Knowledge Graphs
Quach, Ngoc, Wang, Qi, Gao, Zijun, Sun, Qifeng, Guan, Bo, Floyd, Lillian
The widespread use of knowledge graphs in various fields has brought about a challenge in effectively integrating and updating the information within them. When it comes to incorporating contexts, conventional methods often rely on rules or basic machine learning models, which may not fully grasp the complexity and fluidity of context information. This research suggests an approach based on reinforcement learning (RL), specifically utilizing Deep Q-Networks (DQN), to enhance the process of integrating contexts into knowledge graphs. By considering the state of the knowledge graph as environment states, defining actions as operations for integrating contexts, and using a reward function to gauge the improvement in knowledge graph quality post-integration, this method aims to automatically develop strategies for optimal context integration. Our DQN model utilizes neural networks as function approximators, continually updating Q-values to estimate the action-value function, thus enabling effective integration of intricate and dynamic context information. Initial experimental findings show that our RL method outperforms conventional techniques in achieving precise context integration across various standard knowledge graph datasets, highlighting the potential and effectiveness of reinforcement learning in enhancing and managing knowledge graphs.
- North America > United States > California > Yolo County > Davis (0.14)
- North America > United States > Tennessee > Davidson County > Nashville (0.04)
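The abstract describes a standard DQN formulation (states, actions, reward, Q-value updates). As a minimal generic sketch of that update, not the authors' model or their state/action encoding, here is the Bellman target and a one-step TD update for a linear Q-function, with the knowledge-graph state assumed to already be encoded as a feature vector.

```python
import numpy as np

def dqn_target(reward, next_q_values, gamma=0.99, done=False):
    """Bellman target r + gamma * max_a' Q(s', a'); no bootstrap at terminal states."""
    if done:
        return float(reward)
    return float(reward) + gamma * float(np.max(next_q_values))

def td_update(W, state, action, target, lr=0.1):
    """One gradient step on the squared TD error for a linear Q(s, a) = W[a] @ state.
    Each row of W holds the weights for one integration action."""
    td_error = target - W[action] @ state
    W[action] += lr * td_error * state
    return float(td_error)
```

In the paper's setting, `reward` would be the measured improvement in knowledge graph quality after an integration action, and the linear Q-function would be replaced by a deep network, but the target computation is the same.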
Energy Efficiency Optimization for Subterranean LoRaWAN Using A Reinforcement Learning Approach: A Direct-to-Satellite Scenario
Lin, Kaiqiang, Ullah, Muhammad Asad, Alves, Hirley, Mikhaylov, Konstantin, Hao, Tong
The integration of subterranean LoRaWAN and non-terrestrial networks (NTN) delivers substantial economic and societal benefits in remote agriculture and disaster rescue operations. The LoRa modulation leverages quasi-orthogonal spreading factors (SFs) to optimize data rates, airtime, coverage, and energy consumption. However, it is still challenging to effectively assign SFs to end devices to minimize co-SF interference in massive subterranean LoRaWAN NTN. To address this, we investigate a reinforcement learning (RL)-based SF allocation scheme to optimize the system's energy efficiency (EE). To efficiently capture the device-to-environment interactions in dense networks, we propose an SF allocation technique using the multi-agent dueling double deep Q-network (MAD3QN) and the multi-agent advantage actor-critic (MAA2C) algorithms based on an analytical reward mechanism. Our proposed RL-based SF allocation approach achieves better performance than four benchmarks in the extreme underground direct-to-satellite scenario. Remarkably, MAD3QN shows promising potential to surpass MAA2C in terms of convergence rate and EE.
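MAD3QN combines several components; the sketch below isolates only the double-Q target computation (the "double" in dueling double deep Q-network), in which the online network selects the next action and the target network evaluates it. The dueling architecture, multi-agent coordination, and the paper's reward mechanism are omitted; this is a generic illustration, not the authors' implementation.

```python
import numpy as np

def double_dqn_target(reward, q_online_next, q_target_next, gamma=0.99, done=False):
    """Double DQN target: the online net picks argmax_a' Q_online(s', a'), the
    target net evaluates that action. Decoupling selection from evaluation
    reduces the max-operator overestimation of the vanilla DQN target."""
    if done:
        return float(reward)
    a_star = int(np.argmax(q_online_next))
    return float(reward) + gamma * float(q_target_next[a_star])
```

In an SF-allocation setting, each action index would correspond to one candidate spreading factor, and the reward would come from the measured energy-efficiency change, per the abstract's analytical reward mechanism.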
Optimal Sequential Decision-Making in Geosteering: A Reinforcement Learning Approach
Muhammad, Ressi Bonti, Alyaev, Sergey, Bratvold, Reidar Brumer
Trajectory adjustment decisions throughout the drilling process, called geosteering, affect subsequent choices and information gathering, thus resulting in a coupled sequential decision problem. Previous works applying decision optimization methods to geosteering rely on greedy optimization or Approximate Dynamic Programming (ADP). Either decision optimization method requires explicit uncertainty and objective function models, making the development of decision optimization methods for complex and realistic geosteering environments challenging or even impossible. We use the Deep Q-Network (DQN) method, a model-free reinforcement learning (RL) method that learns directly from the decision environment, to optimize geosteering decisions. The expensive computations for RL are handled during the offline training stage. Evaluating the DQN for real-time decision support takes milliseconds and is faster than the traditional alternatives. Moreover, for two previously published synthetic geosteering scenarios, our results show that RL achieves high-quality outcomes comparable to the quasi-optimal ADP. Yet, the model-free nature of RL means that, by replacing the training environment, we can extend it to problems where the ADP solution is prohibitively expensive to compute. This flexibility will allow applying the method to more complex environments and building hybrid versions trained with real data in the future.